Serialize models with exclude_unset, not exclude_defaults #208

Merged
negz merged 2 commits into crossplane:main from negz:best-left-unset on Jun 8, 2026

@negz negz commented Jun 4, 2026

Member

Fixes #207
Fixes #210

resource.update and resource.update_status serialized Pydantic models with model_dump(exclude_defaults=True). This PR makes two related corrections to how those functions serialize models.

exclude_defaults → exclude_unset

exclude_defaults asks "is this field different from its default?". The correct question for server-side apply is "did the caller set this field?", which exclude_unset answers. exclude_defaults also regressed with newer datamodel-code-generator, which emits object defaults as a raw dict with validate_default=True instead of a default_factory. The default is validated into a model instance at construction, and that instance doesn't compare equal to the declared dict default, so unset fields like spec.providerConfigRef leaked into every composed resource. exclude_unset is immune to how a default is represented: a field the caller didn't touch is absent from model_fields_set (see crossplane/cli#64 (comment)).
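A minimal sketch of the difference, assuming Pydantic v2. The Spec and ProviderConfigRef models are hypothetical, hand-written stand-ins that imitate the dict-default style described above; they are not actual generator output.

```python
from typing import Optional

from pydantic import BaseModel, Field


class ProviderConfigRef(BaseModel):
    name: str


class Spec(BaseModel):
    # Hypothetical imitation of the newer generator style: a raw dict
    # default that Pydantic validates into a model instance.
    providerConfigRef: ProviderConfigRef = Field(
        default={"name": "default"}, validate_default=True
    )
    region: Optional[str] = None


# The caller sets only region.
spec = Spec(region="us-east-1")

# Only fields the caller passed are recorded here.
print(spec.model_fields_set)  # {'region'}

# exclude_unset consults model_fields_set, so the default never appears.
unset_dump = spec.model_dump(exclude_unset=True)
print(unset_dump)  # {'region': 'us-east-1'}

# exclude_defaults compares each value against its declared default instead.
# Per the description above, providerConfigRef can survive that comparison.
defaults_dump = spec.model_dump(exclude_defaults=True)
print(defaults_dump)
```

The last print carries no expected output on purpose: what exclude_defaults emits for providerConfigRef depends on how the default is represented, which is the problem this change sidesteps.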

Add by_alias=True

Neither serialization passed by_alias=True, so a field carrying a Pydantic alias was emitted under its Python attribute name rather than its alias, which is the resource's real wire name. datamodel-code-generator can't name a field bool, int, from, continue, or schema, so it emits a bool_ attribute aliased to bool, int_ aliased to int, and so on. A function composing such a field wrote bool_: true instead of bool: true, which doesn't match the resource's schema, so the API server rejects it or silently drops it. This surfaced with Kubernetes Dynamic Resource Allocation, whose device attribute value is a one-of over string, version, bool, and int.
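To illustrate, here is a small sketch assuming Pydantic v2. DeviceAttribute is a hypothetical hand-written model that imitates the suffixed-attribute-plus-alias pattern; it is not the generated Dynamic Resource Allocation model.

```python
from typing import Optional

from pydantic import BaseModel, Field


class DeviceAttribute(BaseModel):
    # Hypothetical model: the wire names bool and int are kept as aliases
    # of suffixed Python attributes.
    bool_: Optional[bool] = Field(default=None, alias="bool")
    int_: Optional[int] = Field(default=None, alias="int")
    string: Optional[str] = None


# With default settings an aliased field is populated by its alias.
attr = DeviceAttribute(**{"bool": True})

# Without by_alias the Python attribute name leaks into the output.
by_name = attr.model_dump(exclude_unset=True)
print(by_name)  # {'bool_': True}

# With by_alias=True the output uses the wire name.
by_alias = attr.model_dump(exclude_unset=True, by_alias=True)
print(by_alias)  # {'bool': True}
```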

Passing by_alias=True is a no-op for ordinary fields and corrects only the keyword-collision cases. It's also symmetric with how pydantic deserializes these models: by default an aliased field is populated only by its alias, so reads and writes now both speak wire names. It's orthogonal to the exclude_unset change: one decides which fields are emitted, the other how they're named.
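The symmetry can be sketched as a round trip, assuming Pydantic v2 with default model configuration. Attribute is a hypothetical model used only for illustration.

```python
from typing import Optional

from pydantic import BaseModel, Field


class Attribute(BaseModel):
    # Hypothetical model with one aliased field and one ordinary field.
    bool_: Optional[bool] = Field(default=None, alias="bool")
    version: Optional[str] = None  # ordinary field, no alias


# Reading: the wire name populates the aliased field.
parsed = Attribute.model_validate({"bool": True, "version": "1.2.3"})

# Writing with by_alias=True reproduces the same wire-format dict.
# The ordinary field is unaffected because it has no alias.
round_trip = parsed.model_dump(exclude_unset=True, by_alias=True)
print(round_trip)  # {'bool': True, 'version': '1.2.3'}

# By default the Python attribute name is not accepted as input. It is
# treated as an unknown key and ignored, so the field stays unset.
ignored = Attribute.model_validate({"bool_": True})
print(ignored.model_fields_set)  # set()
```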

I have:

@negz negz requested a review from bobh66 as a code owner June 4, 2026 00:33
@negz negz force-pushed the best-left-unset branch from 8b04926 to fa72dd0 Compare June 4, 2026 00:39
negz commented Jun 4, 2026

Member Author

I just pinned a fairly huge Python function project to use this commit and I'm not seeing anything concerning. As expected, my tests need a few updates for cases where the function code explicitly set a field to its default. Before this change those fields were omitted on serialization; now they're emitted.

@bobh66 bobh66 left a comment

Collaborator

LGTM

negz added 2 commits June 8, 2026 13:38
resource.update and update_status serialized Pydantic models with
model_dump(exclude_defaults=True), which asks "is this field different
from its default?". The correct question for server-side apply is "did
the caller set this field?", which exclude_unset answers.

exclude_defaults also regressed with newer datamodel-code-generator. It
emits object defaults as a raw dict with validate_default=True instead
of a default_factory. The default is validated into a model instance at
construction, which doesn't compare equal to the declared dict default,
so exclude_defaults fails to exclude it. Unset fields like
spec.providerConfigRef then leaked into every composed resource.

exclude_unset is immune to how a default is represented: a field the
caller didn't touch is absent from model_fields_set. It also keeps
fields the caller explicitly set to their default value, which is more
correct under server-side apply, where setting a field claims ownership
of it.
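A short sketch of that behavior change, assuming Pydantic v2. Spec and its fields are hypothetical and exist only to show a field explicitly set to its default.

```python
from pydantic import BaseModel


class Spec(BaseModel):
    # Hypothetical fields that both carry defaults.
    deletionPolicy: str = "Delete"
    region: str = "us-east-1"


# The caller explicitly sets deletionPolicy to its default value.
spec = Spec(deletionPolicy="Delete")

# exclude_defaults drops it, because the value equals the default.
dropped = spec.model_dump(exclude_defaults=True)
print(dropped)  # {}

# exclude_unset keeps it, because the caller set it.
kept = spec.model_dump(exclude_unset=True)
print(kept)  # {'deletionPolicy': 'Delete'}
```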

The apiVersion and kind workaround stays. Functions build models with
kwargs and rarely pass these, so they're unset and excluded either way.
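The sketch below shows why some workaround is needed, assuming Pydantic v2. Bucket and to_desired are hypothetical; to_desired is one plausible shape for such a workaround and is not the SDK's actual code.

```python
from pydantic import BaseModel


class Bucket(BaseModel):
    # Hypothetical generated-style model: apiVersion and kind have defaults.
    apiVersion: str = "example.org/v1"
    kind: str = "Bucket"
    region: str = ""


def to_desired(source: Bucket) -> dict:
    """Hypothetical helper: dump the model, then restore its identity."""
    data = source.model_dump(exclude_unset=True)
    # apiVersion and kind are usually unset, so copy them from attributes.
    data["apiVersion"] = source.apiVersion
    data["kind"] = source.kind
    return data


# Built with kwargs, without passing apiVersion or kind.
bucket = Bucket(region="us-east-1")

# Both options leave apiVersion and kind out of the dump.
print(bucket.model_dump(exclude_defaults=True))  # {'region': 'us-east-1'}
print(bucket.model_dump(exclude_unset=True))  # {'region': 'us-east-1'}

desired = to_desired(bucket)
print(desired)
```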

See crossplane#207 for more detail.

Signed-off-by: Nic Cope <nicc@rk0n.org>
resource.update and resource.update_status serialize Pydantic models
with model_dump(exclude_unset=True), which doesn't pass by_alias=True.
A model field that carries a Pydantic alias is then serialized under its
Python attribute name rather than its alias, which is the resource's real
wire name.

This bites fields whose KRM name is a Python keyword or builtin.
datamodel-code-generator can't name a field bool, int, from, continue, or
schema, so it emits a bool_ attribute aliased to bool, int_ aliased to
int, and so on. When a function composes a resource that sets such a
field, update wrote the Python name into the desired resource:

  data = source.model_dump(exclude_unset=True)
  # -> {"bool_": True}, but the resource's field is "bool"

The composed resource then carried bool_: true instead of bool: true,
which doesn't match the resource's schema. The API server rejects it or
silently drops the unknown field, and the field the function set never
takes effect. This surfaced with Kubernetes Dynamic Resource Allocation,
whose device attribute value is a one-of over string, version, bool, and
int.

This change passes by_alias=True alongside exclude_unset=True in both
functions. It's a no-op for ordinary fields, which have no alias, and
corrects only the keyword-collision cases. It's also symmetric with how
pydantic deserializes these models: by default an aliased field is
populated only by its alias, so reads and writes now both speak wire
names.

Fixes crossplane#210.

Signed-off-by: Nic Cope <nicc@rk0n.org>
@negz negz force-pushed the best-left-unset branch from f139587 to d18cbbb Compare June 8, 2026 20:38
@negz negz merged commit ad6f1e1 into crossplane:main Jun 8, 2026
6 checks passed
@negz negz deleted the best-left-unset branch June 8, 2026 20:42

Development

Successfully merging this pull request may close these issues.

resource.update doesn't pass by_alias, mangling keyword-named fields
Serialize models with exclude_unset, not exclude_defaults

2 participants